This is going to be an attempt to talk you through, and train you at the same time on, how to create a markdown report.
You run a markdown document by ‘knitting’ it. There is literally an icon with a ball of wool with a knitting needle in it. That is the button we need, and it knits together all the markdown stuff and R code into one super snazzy output.
I will get you to knit the report and see the results, and then we will pick apart the code and see how it is done behind the scenes.
If you have 2 screens try to have the code on one side and the report on the other and you can look side by side at what is happening.
I will also be presenting and you can look at me to follow as well.
How many times have you wanted to create a report that has the beauty and readability of a Word document, but with the interactive graphs and functionality of an Excel filtered table or even a pivot table?
If only there was a way you could pull your data, wrangle it and then finally get it to produce a beautiful report, all in one process.
If only…
If only…
Hang on! That sounds really complicated and difficult, and it must need lots of tricky code. Have we not been telling you how brilliant R is?
Introducing
MarkdownIcon
You can import images, either from the web or from a local directory. This one is from the web and it is massive! As you can see above, you can also insert animated gifs.
(I fear what I have done to the NHS by releasing that knowledge into the wild - I take no responsibility for any repercussions of animated nyan cat reports.)
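As a minimal sketch of one way to pull in an image from R (plain markdown image syntax also works), `knitr::include_graphics()` takes a URL or a file path. The R logo URL below is only a stand-in and is not the image used in this report.

```r
library(knitr)

# A stand-in image from the web (an assumption, not the one used in this report)
img <- include_graphics("https://www.r-project.org/logo/Rlogo.png")

# A local file works the same way, e.g. include_graphics("images/my-cat.gif"),
# where that path is hypothetical and relative to the .Rmd file
img
```

Put this in a code chunk and the image appears in the knitted report at that point.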
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
Word and PDF reports are fine and you can do some very pretty but flat reports in these outputs. That's fine, but the cool kats and kitties want interactive and flashy.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document.
You may notice the nice floating contents page to the left. This is created in the YAML header at the very start of the document. YAML stands for YAML Ain’t Markup Language. It sets some global parameters and can do some cool stuff. You set the headings as per the next section and the document automatically picks up those with a single ‘#’.
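To make the YAML header a bit more concrete, here is a rough sketch that writes a tiny, made-up R Markdown file from R and knits it. The title, file name and headings are all hypothetical, and `toc_depth: 1` is my assumption for how to limit the contents to the single ‘#’ headings; the header of this report may well look different.

```r
# Lines of a minimal, hypothetical R Markdown file
rmd_lines <- c(
  "---",
  "title: \"My report\"",
  "output:",
  "  html_document:",
  "    toc: true",        # build a table of contents
  "    toc_float: true",  # float it at the side of the page
  "    toc_depth: 1",     # only pick up headings with a single '#'
  "---",
  "",
  "# First section",
  "",
  "Some text.",
  "",
  "# Second section",
  "",
  "More text."
)

rmd_path <- file.path(tempdir(), "toc-example.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit it, which is what the Knit button does for you.
# This step needs the rmarkdown package and pandoc (both come with RStudio).
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_path, quiet = TRUE)
}
```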
So what can you do?
You can make a line break
You can do all the normal italics and bold and superscript and strikethrough
Things like block quotes look funky
especially if you put a few together
make sure you end your sentences with 2
spaces to start a new line
Other things that are easy
You can also do numbered lists
Write some stuff in the middle of your list
You can colour in a block to assist with highlighting an area.
You can add footnotes [1]
You can also do basic dynamic tables just in markdown. They look messy in the code but pretty in the report and, thanks to the power of HTML, they re-size according to the screen. All of these tables and charts are mobile friendly.
| Column heading one | Second column heading |
|---|---|
| Stuff | Column 2 stuff |
| Another row of stuff | More column 2 stuff |
You can align the stuff in these tables but they don’t work well with numbers
| Right | Left | Default | Centre |
|---:|:---|---|:---:|
| blurb right aligned | blurb left aligned | blurb default | blurb in da centre |
| 1234 | 56788 | 4545644 | 43535 |
| 34.2 | 3442.1 | 3421.1 | 4512.45 |
You can also play with HTML tags to change font to comic sans
Just because you can does not mean you should!
Tabsets are great for creating nice interactive documents and reducing the length of your document but still cramming in loads of extra data without your reader realising it.
If you look at the code you can see that you can add a little fade to give your report a little extra sparkle, or you can leave it out for extra zip. I am using a bit of fade sparkle for this example.
Blah blah blah super important commentary about the blue line going up.
Oh my - the red line is going up, no one wants to see a red line going up. Must remember to RAG rate this later and add a sad smiley face, pretty sure there will be a lesson on how to do that later.
-insert witty commentary here-
That is the basic text formatting stuff, but we want to do clever things and incorporate our data
We are going to use the NHS-R sample data sets and use the A&E attendances data. This has a date, an organisation code, a type, and numbers for attendances, admissions and breaches.
You can include data within the markdown, specifically within the text, by typing a backtick followed by r, then a variable or formula, and then a closing backtick (you can also format it into a nice readable number).
For example the total number of attendances from the data set is 8,295,237
The date of 09 June 2021 is when you pressed the knit button for this report.
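Here is a small sketch of the kind of expressions that could sit behind those two sentences. The three attendance numbers are stand-in values so that the example runs on its own, and the variable names are made up; the report itself works from the full data set.

```r
# Stand-in values (an assumption, not the full data set)
attendances <- c(10289, 9643, 10424)

# Add them up and format with a thousands separator
total_attendances <- format(sum(attendances), big.mark = ",")
total_attendances
# "30,356"

# Today's date, formatted like "09 June 2021"
knit_date <- format(Sys.Date(), "%d %B %Y")
knit_date

# In the text of the .Rmd file you would then write something like:
#   The total number of attendances is `r total_attendances`
#   and this report was knitted on `r knit_date`.
```

Because the expressions are evaluated every time you knit, the numbers in your sentences update themselves whenever the data changes.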
You can also add ‘code chunks’, these are chunks of code that sit within your markdown document.
So here is the basic way to include some data: you simply type triple backticks to tell RStudio that you now want to do some R stuff.
This will print the summary of the AE attendnace* data set in a not very pretty table
* just noticed a typo in ‘attendance’ - RStudio does have a spell check feature - remember to use it as it is not automatic, there are no wavy red lines like in Word!**
** you also have to do backslash asterisk when you actually want to use an asterisk
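As a sketch of what could go inside such a chunk: in the .Rmd file the code sits between a line of three backticks followed by `{r}` and a closing line of three backticks. I am assuming here that the data is `ae_attendances` from the NHSRdatasets package and that the rows shown are the ‘other’ type for organisation RF4; the exact filter used in this report may differ.

```r
library(dplyr)

# Load the A&E attendances sample data (assumed to come from NHSRdatasets)
data("ae_attendances", package = "NHSRdatasets")

# Keep one organisation and one type, newest month first
ae_small <- ae_attendances %>%
  filter(org_code == "RF4", type == "other") %>%
  arrange(desc(period))

# Printing the object is all it takes to show it in the report
ae_small
```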
## # A tibble: 9 x 6
##   period     org_code type  attendances breaches admissions
##   <date>     <fct>    <fct>       <dbl>    <dbl>      <dbl>
## 1 2019-03-01 RF4      other       10289       90          0
## 2 2019-02-01 RF4      other        9643       87          0
## 3 2019-01-01 RF4      other       10424       77          0
## 4 2018-12-01 RF4      other        9460       95          0
## 5 2018-11-01 RF4      other        8264       30          0
## 6 2018-10-01 RF4      other        7900       14          0
## 7 2018-09-01 RF4      other        7604       39          0
## 8 2018-08-01 RF4      other        7184       73          0
## 9 2018-07-01 RF4      other        5072       10          0
Basic, but by no means pretty. We can use kable to make a prettier table: in just 2 lines of code you can produce this, which has better readability and a nice hover-over highlight to help with reading.
| period | org_code | type | attendances | breaches | admissions |
|---|---|---|---|---|---|
| 2019-03-01 | RF4 | other | 10289 | 90 | 0 |
| 2019-02-01 | RF4 | other | 9643 | 87 | 0 |
| 2019-01-01 | RF4 | other | 10424 | 77 | 0 |
| 2018-12-01 | RF4 | other | 9460 | 95 | 0 |
| 2018-11-01 | RF4 | other | 8264 | 30 | 0 |
| 2018-10-01 | RF4 | other | 7900 | 14 | 0 |
| 2018-09-01 | RF4 | other | 7604 | 39 | 0 |
| 2018-08-01 | RF4 | other | 7184 | 73 | 0 |
| 2018-07-01 | RF4 | other | 5072 | 10 | 0 |
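A table like the one above might be produced along these lines. This is a sketch that assumes `kable()` from knitr plus the kableExtra package for the styling; the data frame is just a few rows copied from the table so that the example runs on its own.

```r
library(knitr)
library(kableExtra)

# A few rows copied from the table above (a stand-in for the real data)
ae_small <- data.frame(
  period      = as.Date(c("2019-03-01", "2019-02-01", "2019-01-01", "2018-12-01")),
  org_code    = "RF4",
  type        = "other",
  attendances = c(10289, 9643, 10424, 9460),
  breaches    = c(90, 87, 77, 95),
  admissions  = 0
)

# The two lines that matter: build the table, then style it
pretty_table <- kable(ae_small, format = "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

pretty_table
```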
You can also use kable to indent rows, colour the table in, change formats and add sub headings, and it is quite simple to use.
| period | org_code | type | attendances | breaches | admissions |
|---|---|---|---|---|---|
| sub heading and indent first 3 rows | | | | | |
| 2019-03-01 | RF4 | other | 10289 | 90 | 0 |
| 2019-02-01 | RF4 | other | 9643 | 87 | 0 |
| 2019-01-01 | RF4 | other | 10424 | 77 | 0 |
| 2018-12-01 | RF4 | other | 9460 | 95 | 0 |
| 2018-11-01 | RF4 | other | 8264 | 30 | 0 |
| 2018-10-01 | RF4 | other | 7900 | 14 | 0 |
| 2018-09-01 | RF4 | other | 7604 | 39 | 0 |
| 2018-08-01 | RF4 | other | 7184 | 73 | 0 |
| 2018-07-01 | RF4 | other | 5072 | 10 | 0 |
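For the sub heading and indent, a possible sketch uses `pack_rows()` from kableExtra, with `row_spec()` and `column_spec()` for colour and formatting. The colours and the choice of column are made up for illustration, and the data frame is again a few rows copied from the table above.

```r
library(knitr)
library(kableExtra)

# A few rows copied from the table above (a stand-in for the real data)
ae_small <- data.frame(
  period      = as.Date(c("2019-03-01", "2019-02-01", "2019-01-01", "2018-12-01")),
  org_code    = "RF4",
  type        = "other",
  attendances = c(10289, 9643, 10424, 9460),
  breaches    = c(90, 87, 77, 95),
  admissions  = 0
)

fancy_table <- kable(ae_small, format = "html") %>%
  kable_styling(bootstrap_options = "hover", full_width = FALSE) %>%
  # Group rows 1 to 3 under a sub heading; pack_rows() indents them too
  pack_rows("sub heading and indent first 3 rows", 1, 3) %>%
  # Row 0 is the header row: colour it in (colours are arbitrary)
  row_spec(0, bold = TRUE, color = "white", background = "#005EB8") %>%
  # Make the attendances column bold
  column_spec(4, bold = TRUE)

fancy_table
```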
Kable is really good for a static and pretty table, but we want interactive.
This is a DT datatable. It has lovely filters on the periods and different columns. You can reorder the data and show as much or as little as you like. The export buttons export what you have selected. There is also a search function for the data.
You can mess around with the defaults so that it shows a larger table.
DT datatable is pretty good and quick, and even the base default comes up with a pretty good table. It is very nice at presenting a flat table, but if you want calculations and totals, you have to hard code them into your table.
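A sketch of a datatable with column filters, export buttons and a bigger default page might look like this. The particular options are my own picks rather than the settings used in this report, and the data is assumed to be `ae_attendances` from NHSRdatasets.

```r
library(DT)

data("ae_attendances", package = "NHSRdatasets")

ae_table <- datatable(
  ae_attendances,
  filter     = "top",        # a filter box above every column
  rownames   = FALSE,
  extensions = "Buttons",    # adds the export buttons
  options    = list(
    pageLength = 25,         # show a larger table than the default 10 rows
    dom        = "Bfrtip",   # where the buttons, search box and paging sit
    buttons    = c("copy", "csv", "excel")
  )
)

ae_table
```

Even plain `datatable(ae_attendances)` gives you sorting, paging and search, so the options are there for polish rather than necessity.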
That is one format, or if you want more summary-type reports you can use reactable.
This is reactable, and it works a little more like a filtered and sub grouped table. It is a bit trickier to use, but seems to be able to do all the stuff of a DT datatable and a bit more.
Reactable is really good at creating summaries and drill down datasets. It creates all the groupings of data itself and so you do not have to wrangle the data into groups before you make the table. You can also create high level aggregate functions on a group and have the ability to drill down to see the underlying data.
Reactable has many more functions than a DT datatable but as I said, is a little more tricky to use.
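Here is a minimal sketch of the grouping and drill down idea, assuming the `ae_attendances` data from NHSRdatasets. The choice of grouping columns and of `sum` as the aggregate is mine, so treat it as a starting point rather than the table shown in this report.

```r
library(reactable)

data("ae_attendances", package = "NHSRdatasets")

summary_table <- reactable(
  ae_attendances,
  # reactable builds the groups itself, no wrangling needed beforehand
  groupBy    = c("org_code", "type"),
  filterable = TRUE,
  columns    = list(
    # Totals shown on each group row; expand a group to see the rows behind it
    attendances = colDef(aggregate = "sum", format = colFormat(separators = TRUE)),
    breaches    = colDef(aggregate = "sum", format = colFormat(separators = TRUE)),
    admissions  = colDef(aggregate = "sum", format = colFormat(separators = TRUE))
  )
)

summary_table
```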
You can use it to do funky stuff like this. I see your sparklines, Excel, and raise you spark box plots! (and conditional formatting)
(Want to be super impressed? Hover over the spark box)
Of course we can add a plot, and you can change the size, alignment and all of that stuff. You can wrap your text around a plot and potentially have plots side by side.
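As a rough sketch of a static plot, here is a ggplot2 line chart of total monthly attendances. I am assuming ggplot2 and the `ae_attendances` data from NHSRdatasets; the plot in this report may be built differently. Size and alignment are usually set with knitr chunk options such as `fig.width`, `fig.height`, `fig.align` and `out.width` in the chunk header.

```r
library(dplyr)
library(ggplot2)

data("ae_attendances", package = "NHSRdatasets")

# One total per month across all organisations
monthly <- ae_attendances %>%
  group_by(period) %>%
  summarise(attendances = sum(attendances), .groups = "drop")

p <- ggplot(monthly, aes(x = period, y = attendances)) +
  geom_line(colour = "steelblue") +
  labs(title = "Total A&E attendances per month", x = NULL, y = "Attendances")

p
```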
You need to switch to the code here for a little funky plotting short cut.
So that’s a plot, does what it says on the tin, but what is better than a static plot?
Interactive plots…
It is all far more customisable within the plotly function: you can have multiple selectable data sets and all manner of other stuff.
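A minimal sketch of an interactive line chart with `plot_ly()`, again assuming the `ae_attendances` data from NHSRdatasets. Splitting by `type` is my own choice to show off the clickable legend.

```r
library(dplyr)
library(plotly)

data("ae_attendances", package = "NHSRdatasets")

# Monthly totals for each type of department
by_type <- ae_attendances %>%
  group_by(period, type) %>%
  summarise(attendances = sum(attendances), .groups = "drop")

# One line per type; click a legend entry to switch that line on or off
interactive_plot <- plot_ly(
  by_type,
  x = ~period, y = ~attendances, color = ~type,
  type = "scatter", mode = "lines"
)

interactive_plot
```

If you already have a ggplot object, `ggplotly()` from the same package is a quick way to make it interactive.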
Pretty cool, but let's show off and animate.
Obviously pretty pointless in this example, but may be good as a way to be more visual to get a point across.
If it gets people actually looking at the data and wanting their little wiggly lines going up that has to be good for patient care.
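One possible way to animate is plotly's `frame` argument, which draws one frame per value and adds a play button and slider. This is only a sketch with that approach; the animation in this report may have been made with a different tool, and the bar chart by type is my own example.

```r
library(dplyr)
library(plotly)

data("ae_attendances", package = "NHSRdatasets")

# Monthly totals by type, with the month as text to use as the frame label
by_type <- ae_attendances %>%
  group_by(period, type) %>%
  summarise(attendances = sum(attendances), .groups = "drop") %>%
  mutate(month = format(period, "%Y-%m"))

animated_plot <- plot_ly(
  by_type,
  x = ~type, y = ~attendances,
  frame = ~month,          # one animation frame per month
  type = "bar"
) %>%
  animation_opts(frame = 300)  # milliseconds per frame

animated_plot
```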
So how about we take the admissions data for the providers above and create a nice time series graph. This dygraph is nice for playing with time series as it allows you to zoom in on certain areas.
There is also a nice little box on the bottom left of the graph. This allows you to smooth your data with a rolling average on the fly. Really useful for things such as length of stay, or anything with lots of variability where you are trying to pull out an overall trend.
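A sketch of how a chart like this could be put together with the dygraphs package. I am assuming the `ae_attendances` data from NHSRdatasets, and I pick the three organisations with the most admissions purely as an example, since I am not assuming which providers are used in this report. dygraphs wants a time series object, so the data is reshaped and converted with xts first.

```r
library(dplyr)
library(tidyr)
library(xts)
library(dygraphs)

data("ae_attendances", package = "NHSRdatasets")

# Three organisations with the most admissions (an arbitrary choice)
top_orgs <- ae_attendances %>%
  group_by(org_code) %>%
  summarise(total = sum(admissions), .groups = "drop") %>%
  slice_max(total, n = 3, with_ties = FALSE) %>%
  pull(org_code)

# One row per month, one column per organisation
admissions_wide <- ae_attendances %>%
  filter(org_code %in% top_orgs) %>%
  group_by(period, org_code) %>%
  summarise(admissions = sum(admissions), .groups = "drop") %>%
  pivot_wider(names_from = org_code, values_from = admissions)

admissions_xts <- xts(as.matrix(admissions_wide[, -1]),
                      order.by = admissions_wide$period)

dg <- dygraph(admissions_xts, main = "Monthly admissions") %>%
  dyRangeSelector() %>%        # drag to zoom in on a period
  dyRoller(rollPeriod = 1)     # the rolling average box, bottom left

dg
```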
Other good functions for visualising data sets.
One is treemap, it is like a posh pie chart for looking at proportions of a variable.
Obviously R can do pie charts, but whereas I am willing to show you how to add animated gifs into your reports, even I would not sink that low.
This is a nice overview of a large amount of data, but it gets a bit messy when you have a lot of factors.
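Here is a minimal sketch using the treemap package, assuming the `ae_attendances` data from NHSRdatasets. Limiting it to the 20 busiest organisations is my own choice to keep the picture readable.

```r
library(dplyr)
library(treemap)

data("ae_attendances", package = "NHSRdatasets")

# Total attendances for the 20 busiest organisations
top_orgs <- ae_attendances %>%
  group_by(org_code) %>%
  summarise(attendances = sum(attendances), .groups = "drop") %>%
  slice_max(attendances, n = 20, with_ties = FALSE) %>%
  mutate(org_code = as.character(org_code)) %>%
  as.data.frame()

# Each rectangle is an organisation, sized by its share of attendances
tm <- treemap(
  top_orgs,
  index = "org_code",
  vSize = "attendances",
  title = "Share of attendances, 20 busiest organisations"
)
```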
Another really cool thing to play with is a dendrogram which you can make with collapsibleTree.
This is really good at showing flow through pathways and systems. You can make them horizontal or vertical and play with all manner of bits on the nodes.
Click on the nodes and you can also zoom in and out and scroll around.
I also find it really relaxing for some reason.
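A sketch of a small tree with collapsibleTree, assuming the `ae_attendances` data from NHSRdatasets. The hierarchy (type, then organisation), the root label and the cut down to five organisations per type are all my own choices for illustration.

```r
library(dplyr)
library(collapsibleTree)

data("ae_attendances", package = "NHSRdatasets")

# Five busiest organisations within each type
tree_data <- ae_attendances %>%
  group_by(type, org_code) %>%
  summarise(attendances = sum(attendances), .groups = "drop") %>%
  group_by(type) %>%
  slice_max(attendances, n = 5, with_ties = FALSE) %>%
  ungroup() %>%
  mutate(across(c(type, org_code), as.character)) %>%
  as.data.frame()

tree <- collapsibleTree(
  tree_data,
  hierarchy = c("type", "org_code"),  # the levels of the tree, top to bottom
  root      = "All attendances",
  attribute = "attendances",          # the number carried by each node
  zoomable  = TRUE
)

tree
```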
Leaflet is a great mapping library and works with OpenStreetMap, so you don’t have to worry about Google API tokens and the like. There are some fantastic things you can do with the Google service, which allows you to access travel times and route finding, but for simple mapping leaflet is great. You can do heatmaps and areas, draw lines across points and also add layers that you can select on and off.
This example has 3 teams that are set up as layers and you can turn each one on and off.
The maps can be scrolled and zoomed, and what is nice is that the icons remain to scale.
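A minimal sketch of the layers idea: each team gets its own group of markers, and a layers control lets the reader switch the groups on and off. The team names, site names and coordinates are entirely made up for illustration.

```r
library(leaflet)

# Made-up sites for three hypothetical teams
teams <- data.frame(
  team = c("Team A", "Team A", "Team B", "Team B", "Team C", "Team C"),
  name = c("Site 1", "Site 2", "Site 3", "Site 4", "Site 5", "Site 6"),
  lat  = c(51.50, 51.52, 51.48, 51.46, 51.54, 51.55),
  lng  = c(-0.12, -0.10, -0.15, -0.18, -0.08, -0.14)
)

# Start with the default OpenStreetMap tiles
m <- leaflet() %>% addTiles()

# Add each team's markers as a separate, named group
for (t in unique(teams$team)) {
  m <- m %>%
    addMarkers(
      data  = teams[teams$team == t, ],
      lng   = ~lng, lat = ~lat,
      label = ~name,
      group = t
    )
}

# The tick boxes that turn each team on and off
m <- m %>%
  addLayersControl(
    overlayGroups = unique(teams$team),
    options       = layersControlOptions(collapsed = FALSE)
  )

m
```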
Wordcloud2 is the sequel to wordcloud and, much like Evil Dead 2 to the original, it is a far superior product. It has some really nice, easy-to-use features and can make all manner of different word cloud types.
However, before you get to a word cloud you need some data, which is basically a list of words and their frequency. You can do this manually on your fingers or you can get R to do this for you. I definitely recommend the latter.
To get to that you read in some data, strip out all the gubbins such as punctuation, remove all the ‘stop words’ such as ‘the’ and ‘and’, and then remove white space, and there you have a bunch of words fit for a cloud.
This is an example that pulls the text from a popular children’s novel and creates a cloud. Hopefully you can guess the book from the cloud.
You can hover over the words in the cloud and it will tell you the word and give you its frequency.
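The clean-up steps described above can be sketched in base R like this. The sample sentences and the tiny stop word list are made up so the example runs on its own; a text mining package would give you a proper stop word list, and the report itself may well use one.

```r
# Made-up sample text (a stand-in for the novel)
text <- c(
  "The cat sat on the mat.",
  "The cat chased the mouse, and the mouse ran!",
  "A mouse and a cat cannot be friends."
)

# A tiny, hypothetical stop word list
stop_words <- c("the", "and", "a", "on", "be", "cannot")

words <- tolower(paste(text, collapse = " "))  # one long lower-case string
words <- gsub("[[:punct:]]", " ", words)       # strip out the punctuation
words <- unlist(strsplit(words, "\\s+"))       # split on white space
words <- words[nzchar(words) & !words %in% stop_words]  # drop blanks and stop words

# Count how often each word appears
freq <- sort(table(words), decreasing = TRUE)
word_freq <- data.frame(
  word = names(freq),
  freq = as.integer(freq),
  stringsAsFactors = FALSE
)

# wordcloud2 expects the word in the first column and the frequency in the second
cloud <- wordcloud2::wordcloud2(word_freq)
cloud
```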
Saving perhaps my favourite until last is the super awesome rpivotTable.
This has full click and drag functionality, as well as the option to set up defaults in the report. You can also click through, create charts and heat maps, filter your data, calculate a pivot table with a median and just do all sorts of magic.
You can click and drag the variables around. You can click on the arrows to the side of the variables to filter them. You can click on the count to select a different metric and finally you can click on the table to change the results to a graph or heatmap or loads of things.
It doesn’t like super huge data sets if you are running it locally but if you get clever with shiny, you can do big things.
It also has a habit of overlapping with stuff below it. I am working on an HTML solution to this and I think this is a ‘feature’ that is being worked on by the developers. I usually just add it at the end of a report or on a separate tab to get around this issue.
This whole functionality is done with one line of code. (!!!)
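That one line might look like the first call below, assuming the `ae_attendances` data from NHSRdatasets. The second call is a sketch of setting some defaults so the table opens with a layout already in place; the particular rows, columns and renderer are my own picks.

```r
library(rpivotTable)

data("ae_attendances", package = "NHSRdatasets")

# The one-liner: an empty pivot table ready for click and drag
basic_pivot <- rpivotTable(ae_attendances)

# The same thing with some defaults set up for the reader
default_pivot <- rpivotTable(
  ae_attendances,
  rows           = "type",
  cols           = "period",
  aggregatorName = "Sum",
  vals           = "attendances",
  rendererName   = "Heatmap"
)

default_pivot
```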
Please feel free to hack and ~~steal~~ share best practice from this report.
Some really nice visualisation tips can be found at
https://www.data-to-viz.com/caveats.html
and some more markdown tips at
https://holtzy.github.io/Pimp-my-rmd/
Other than that, I wish you well on your R journey, and please do not hesitate to contact me if you have found any interesting things to share.
Merry markdowning
Contact Simon.Wellesley-Miller@nhs.net
[1] This is the footnote from the footnote added way way up above